693 research outputs found

How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies

Author: Ernst Damien
Fonteneau Raphael
François-Lavet Vincent
Publication date: 01/12/2015

Using deep neural networks as function approximators for reinforcement learning tasks has recently been shown to be very powerful for solving problems approaching real-world complexity. Using these results as a benchmark, we discuss the role that the discount factor may play in the quality of the learning process of a deep Q-network (DQN). When the discount factor progressively increases up to its final value, we empirically show that it is possible to significantly reduce the number of learning steps. When used in conjunction with a varying learning rate, we empirically show that it outperforms the original DQN on several experiments. We relate this phenomenon to the instabilities of neural networks when they are used in an approximate Dynamic Programming setting. We also describe the possibility of falling into a local optimum during the learning process, thus connecting our discussion with the exploration/exploitation dilemma.
Comment: NIPS 2015 Deep Reinforcement Learning Workshop

arXiv.org e-Print Archive

Open Repository and Bibliography - Liège
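
The abstract of "How to Discount Deep Reinforcement Learning" describes a discount factor that rises progressively to its final value, paired with a varying learning rate. A minimal sketch of what such schedules could look like follows. The geometric update rule, the function names, and every numeric default are illustrative assumptions; none of them is taken from the abstract.

```python
def discount_schedule(gamma_start=0.9, gamma_final=0.99, shrink=0.98, epochs=50):
    """Return one discount factor per epoch, rising toward gamma_final.

    Assumed rule (not from the abstract): the gap (1 - gamma) shrinks
    geometrically by `shrink` each epoch, capped at gamma_final.
    """
    gammas = []
    gamma = gamma_start
    for _ in range(epochs):
        gammas.append(gamma)
        # Shrink the distance to 1, then cap at the final value.
        gamma = min(gamma_final, 1.0 - shrink * (1.0 - gamma))
    return gammas


def learning_rate_schedule(lr_start=0.00025, decay=0.98, epochs=50):
    """Return one learning rate per epoch, decaying geometrically (assumed rule)."""
    return [lr_start * decay ** k for k in range(epochs)]


if __name__ == "__main__":
    pairs = zip(discount_schedule(epochs=5), learning_rate_schedule(epochs=5))
    for epoch, (gamma, lr) in enumerate(pairs):
        print(f"epoch {epoch}: gamma={gamma:.4f}, lr={lr:.6f}")
```

In a training loop, the per-epoch gamma would be used when forming the Q-learning target and the per-epoch learning rate would be passed to the optimizer; both hooks depend on the DQN implementation in use.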

Min Max Generalization for Two-stage Deterministic Batch Mode Reinforcement Learning: Relaxation Schemes

Author: Boigelot Bernard
Ernst Damien
Fonteneau Raphael
Louveaux Quentin
Publication date: 01/01/2012

We study the minmax optimization problem introduced in [22] for computing policies for batch mode reinforcement learning in a deterministic setting. First, we show that this problem is NP-hard. In the two-stage case, we provide two relaxation schemes. The first relaxation scheme works by dropping some constraints in order to obtain a problem that is solvable in polynomial time. The second relaxation scheme, based on a Lagrangian relaxation where all constraints are dualized, leads to a conic quadratic programming problem. We also theoretically prove and empirically illustrate that both relaxation schemes provide better results than those given in [22].

arXiv.org e-Print Archive

Open Repository and Bibliography - Liège

Benchmarking for Bayesian Reinforcement Learning

Author: Castronovo Michael
Couetoux Adrien
Ernst Damien
Fonteneau Raphael
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 14/09/2015

In the Bayesian Reinforcement Learning (BRL) setting, agents try to maximise the collected rewards while interacting with their environment, using prior knowledge that is made available beforehand. Many BRL algorithms have already been proposed, but even though a few toy examples exist in the literature, there are still no extensive or rigorous benchmarks to compare them. The paper addresses this problem and provides a new BRL comparison methodology along with the corresponding open source library. In this methodology, a comparison criterion is defined that measures the performance of algorithms on large sets of Markov Decision Processes (MDPs) drawn from some probability distributions. In order to enable the comparison of non-anytime algorithms, our methodology also includes a detailed analysis of the computation time requirement of each algorithm. Our library is released with all source code and documentation: it includes three test problems, each of which has two different prior distributions, and seven state-of-the-art RL algorithms. Finally, our library is illustrated by comparing all the available algorithms, and the results are discussed.
Comment: 37 pages

arXiv.org e-Print Archive

Directory of Open Access Journals

PubMed Central

Open Repository and Bibliography - Liège

FigShare

On overfitting and asymptotic bias in batch reinforcement learning with partial observability

Author: Ernst Damien
Fonteneau Raphael
Francois-Lavet Vincent
Pineau Joelle
Rabusseau Guillaume
Publication date: 06/02/2019

This paper provides an analysis of the tradeoff between asymptotic bias (suboptimality with unlimited data) and overfitting (additional suboptimality due to limited data) in the context of reinforcement learning with partial observability. Our theoretical analysis formally shows that a smaller state representation decreases the risk of overfitting while potentially increasing the asymptotic bias. This analysis relies on expressing the quality of a state representation by bounding L1 error terms of the associated belief states. Theoretical results are empirically illustrated when the state representation is a truncated history of observations, both on synthetic POMDPs and on a large-scale POMDP in the context of smart grids, with real-world data. Finally, similarly to known results in the fully observable setting, we also briefly discuss and empirically illustrate how using function approximators and adapting the discount factor may enhance the tradeoff between asymptotic bias and overfitting in the partially observable context.
Comment: Accepted at the Journal of Artificial Intelligence Research (JAIR) - 31 pages

arXiv.org e-Print Archive

Open Repository and Bibliography - Liège

Cybersecurity in Power Grids: Challenges and Opportunities

Author: Ernst Raphael
Hacker Immanuel
Henze Martin
Klaer Benedikt
Krause Tim
Publication date: 01/01/2021

Increasing volatilities within power transmission and distribution force power grid operators to amplify their use of communication infrastructure to monitor and control their grid. The resulting increase in communication creates a larger attack surface for malicious actors. Indeed, cyber attacks on power grids have already succeeded in causing temporary, large-scale blackouts in the recent past. In this paper, we analyze the communication infrastructure of power grids to derive the resulting fundamental cybersecurity challenges. Based on these challenges, we identify a broad set of attack vectors and attack scenarios that threaten the security of power grids. To address these challenges, we propose to rely on a defense-in-depth strategy, which encompasses measures for (i) device and application security, (ii) network security, and (iii) physical security, as well as (iv) policies, procedures, and awareness. For each of these categories, we distill and discuss a comprehensive set of state-of-the-art approaches, and identify further opportunities to strengthen cybersecurity in interconnected power grids.

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

PubMed Central

Repositorium für Naturwissenschaften und Technik

Publikationsserver der RWTH Aachen University

Thin crystalline macroporous silicon solar cells with ion implanted emitter

Author: Brendel Rolf
Ernst Marco
Kajari-Schröder Sarah
Niepelt Raphael
Schulte-Huxel Henning
Publication venue: Amsterdam : Elsevier
Publication date: 01/01/2013

We separate a (34 ± 2) μm-thick macroporous Si layer from an n-type Si wafer by means of electrochemical etching. The porosity is p = (26.2 ± 2.4)%. We use ion implantation to selectively dope the outer surfaces of the macroporous Si layer. No masking of the surface is required. The pores are open during the implantation process. We fabricate a macroporous Si solar cell with an implanted boron emitter at the front side and an implanted phosphorus region at the rear side. The short-circuit current density is 34.8 mA cm⁻² and the open-circuit voltage is 562 mV. With a fill factor of 69.1%, the cell achieves an energy-conversion efficiency of 13.5%.
Federal Ministry for Environment, Nature Conservation, and Nuclear Safety/FKZ 032514

Elsevier - Publisher Connector

Institutionelles Repositorium der Leibniz Universität Hannover
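
The figures quoted in the macroporous silicon solar cell abstract can be cross-checked with the standard relation efficiency = (short-circuit current density × open-circuit voltage × fill factor) / incident power density. The short sketch below does that arithmetic. The incident power density of 100 mW cm⁻² (a common standard test condition) is an assumption that the abstract does not state, and the function name is illustrative.

```python
def efficiency_percent(jsc_ma_cm2, voc_mv, fill_factor, irradiance_mw_cm2=100.0):
    """Energy-conversion efficiency in percent.

    irradiance_mw_cm2 defaults to 100 mW/cm^2, an assumed standard test
    condition that is not given in the abstract.
    """
    voc_v = voc_mv / 1000.0                    # mV -> V
    p_out = jsc_ma_cm2 * voc_v * fill_factor   # mA/cm^2 * V = mW/cm^2
    return 100.0 * p_out / irradiance_mw_cm2


# Values quoted in the abstract: 34.8 mA/cm^2, 562 mV, fill factor 69.1 %.
eta = efficiency_percent(34.8, 562.0, 0.691)
print(f"{eta:.1f} %")  # prints "13.5 %"
```

Under that assumption the three quoted quantities reproduce the reported efficiency of 13.5%.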

Multiple Slips in Atomic-Scale Friction: An Indicator for the Lateral Contact Damping

Author: Baratoff Alexis
Glatzel Thilo
Gnecco Enrico
Meyer Ernst
Roth Raphael
Steiner Pascal
Publication date: 18/06/2018

The occurrence of multiple jumps in 2D atomic-scale friction measurements is used to quantify the viscous damping accompanying the stick-slip motion of a sharp tip in contact with a NaCl(001) surface. Multiple slips are observed without apparent wear for normal forces between 13 and 91 nN. For scans parallel to [100] directions, the tip jumps between minima of the substrate corrugation potential in a zigzag fashion. An algorithm is applied to determine histograms of lateral force jumps which characterize multiple slips. The same algorithm is used to classify multiple slips occurring in calculated lateral force maps. Comparisons between simulations and experiments indicate that the nanometer-sized contact is underdamped at intermediate loads (13-26 nN) and becomes slightly overdamped at higher loads. The proposed procedure is a novel way to estimate the lateral contact damping, which plays an important role in the interpretation of measurements of the velocity and temperature dependence of friction, of slip duration, and of the reduction of friction by applied perpendicular or parallel oscillation.

RERO DOC Digital Library

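9. Two cases of Gram-negative bacillus sepsis in diabetic patients (585th Regular Meeting of the Chiba Medical Society / Regular Meeting of the First Department of Internal Medicine Alumni Association)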

Author: Adrien Couëtoux (2837570)
Damien Ernst (2837564)
Michael Castronovo (2837567)
Raphael Fonteneau (2837573)
Publication venue: Chiba Medical Society

Offline computation cost vs. performance (inaccurate case).

FigShare

A bibliometric analysis of orthogeriatric care: top 50 articles.

Author: Bastian Johannes Dominik
Ernst Raphael Simon
Gieger Jochen
Meier Malin Kristin
Stuck Andreas
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2022

BACKGROUND: The population is ageing, and orthogeriatric care is an emerging research topic. PURPOSE: This bibliometric review aims to provide an overview of the most influential literature in the field of orthogeriatric care and to investigate the status and trends of research in that field. METHODS: From the Core Collection databases in the Thomson Reuters Web of Knowledge, the most influential original articles with reference to orthogeriatric care were identified in December 2020 using a multistep approach. A total of 50 articles were included and analysed in this bibliometric review. RESULTS: The 50 most cited articles were published in 34 different journals between 1983 and 2017. The number of total citations per article ranged from 34 to 704 (mean citations per article: n = 93). In the majority of publications, geriatricians (62%) accounted for the first authorship, followed by others (20%) and (orthopaedic) surgeons (18%). Articles mostly originated from Europe (76%), followed by Asia-Pacific (16%) and Northern America (8%). Key countries (UK, Sweden, and Spain) and a key topic (hip fracture) are the main drivers of orthogeriatric research. The majority of articles reported on therapeutic studies (62%). CONCLUSION: This bibliometric review acknowledges recent research. Orthogeriatric care is an emerging research topic to which surgeons have the potential to contribute, and related topics such as intraoperative procedures, fractures other than hip fractures, or elective surgery have the potential to widen the field of research.

PubMed Central

Bern Open Repository and Information System (BORIS)